Thermodynamics describes large-scale, slowly evolving systems. Two modern approaches generalize
thermodynamics: fluctuation theorems, which concern finite-time nonequilibrium processes, and
one-shot statistical mechanics, which concerns small scales and finite numbers of trials. Combining
these approaches, we calculate a one-shot analog of the average dissipated work defined in
fluctuation contexts: the cost of performing a protocol in finite time instead of quasistatically.
The average dissipated work has been shown to be proportional to a relative entropy between
phase-space densities, one between quantum states, and one between probability distributions over
possible values of work. We derive one-shot analogs of all three equations, demonstrating that the
order-$\infty$ Rényi divergence is proportional to the maximum dissipated work in each case. These
one-shot analogs of fluctuation-theorem results contribute to the unification of these two toolkits
for small-scale, nonequilibrium statistical physics.


Thermodynamics concerns large scales and infinitesimally slow evolutions. In the thermodynamic
limit, a system’s size approaches infinity, and the system is typified by mean behaviors.
Infinitesimally slow, quasistatic processes are described with the free energy $F$, with
temperature, and with other equilibrium quantities.

Two recently developed frameworks generalize thermodynamic concepts, such as work and heat, beyond
slow processes and infinite sizes. Fluctuation relations interrelate equilibrium quantities such as
$F$ with nonequilibrium processes (e.g., [?]). One-shot statistical mechanics quantifies the
efficiency with which work can be invested or extracted, not only on average as the number of
trials approaches infinity, but also if few trials are performed (e.g., [?]). One-shot statistical
mechanics grew from one-shot information theory (e.g., [?]), the study of entropies apart from
Shannon’s and von Neumann’s [?], to describe protocols whose trials are not necessarily independent
and identically distributed according to the same probability distribution or quantum state. A
combination of fluctuation relations and one-shot statistical mechanics describes quite general
thermodynamic systems [13].

Transforming one equilibrium state quasistatically into another requires an amount $W$ of work
equal to the difference between the states’ free energies: $W = \Delta F$. Implementing a protocol
in finite time yields a nonequilibrium state and costs extra work, some dissipated as heat. This
penalty of irreversibility is called the dissipated work, or irreversible work. The average
$\langle W_{\rm diss} \rangle := \langle W \rangle - \Delta F$ over many trials has been studied in
fluctuation contexts (e.g., [?]). We define the one-shot dissipated work
$W_{\rm diss} := W - \Delta F$ as the penalty paid in one trial [?].


^ Our discussion of work can be phrased alternatively in terms of entropy production (e.g., [?]).


$\langle W_{\rm diss} \rangle$ has been shown to be proportional to three instances of the
Kullback-Leibler (KL) divergence, or average relative entropy, $D$. $D$ quantifies how much two
probability distributions, or two quantum states, differ. $\langle W_{\rm diss} \rangle$ has been
related to a $D$ between phase-space densities $\rho(q,p,t)$ and $\tilde\rho(q,-p,t)$ [?], a $D$
between quantum states $\rho(t)$ and $\tilde\rho(t)$ [?], and a $D$ between probability
distributions $P_{\rm fwd}(W)$ and $P_{\rm rev}(-W)$. We derive one-shot analogs of all three
relationships.

Rényi divergences have recently appeared in fluctuation-relation contexts [?]. That work pertains
specifically to resource theories, which we will not use. We follow the approach of [13], building
on assumptions used to derive Crooks’ Theorem.

We begin by reviewing fluctuation theorems and Rényi divergences, focusing on the one-shot
order-$\infty$ Rényi divergence $D_\infty$. We recall each $\langle W_{\rm diss} \rangle$
proportionality and derive its one-shot analog. Our main results relate the maximum possible
penalty $W^{\rm worst}_{\rm diss}$ of investing work in finite time to three instances of
$D_\infty$. Our one-shot analogs of fluctuation-relation results illustrate the insights offered by
merging fluctuation relations with one-shot statistical mechanics.

Fluctuation theorems. Consider a system governed by a time-dependent Hamiltonian $H(\lambda_t)$.
The external parameter $\lambda_t$ changes in time: $t \in [-\tau, \tau]$. Suppose the system
begins in the thermal state $\gamma_{-\tau} := e^{-\beta H(\lambda_{-\tau})}/Z_{-\tau}$, wherein
$\beta$ denotes a heat bath’s inverse temperature and $Z_{-\tau}$ normalizes the state. Suppose an
agent switches $\lambda_t$ from $\lambda_{-\tau}$ to $\lambda_\tau$ while the system interacts with
the bath. The switching costs work, the amount of which varies from trial to trial. A probability
distribution $P_{\rm fwd}(W)$ represents the probability that a given trial costs work $W$. By
$P_{\rm rev}(-W)$, we denote the probability that initializing the Hamiltonian to
$H(\lambda_\tau)$ and initializing the system in
$\gamma_\tau := e^{-\beta H(\lambda_\tau)}/Z_\tau$, then reversing the drive according to
$\lambda_{-t}$, outputs work $W$.

Fluctuation relations such as Crooks’ Theorem govern these distributions [?]. Let
$\Delta F := F(\gamma_\tau) - F(\gamma_{-\tau})$ denote the difference between the free energy of
$\gamma_\tau$ and that of $\gamma_{-\tau}$. (Throughout this letter, we shall assume $\Delta F$ is
finite.) Assuming the system is classical; coupled to a bath; and undergoing a Markovian,
microscopically reversible evolution, Crooks proved that

$$\frac{P_{\rm fwd}(W)}{P_{\rm rev}(-W)} = e^{\beta(W - \Delta F)} \tag{1}$$

[13]. Identical theorems have been shown to govern quantum systems isolated from the bath [?], or
interacting with the bath while work is performed (e.g., [?]).

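Rényi divergences. The order-$\alpha$ Rényi divergence quantifies the distinctness of probability
distributions $P(x)$ and $Q(x)$,

$$D_\alpha(P\|Q) := \frac{1}{\alpha - 1} \ln\left( \int dx\, P^\alpha(x)\, Q^{1-\alpha}(x) \right), \tag{2}$$

or of quantum states $\rho$ and $\sigma$ [25]:

$$D_\alpha(\rho\|\sigma) := \frac{1}{\alpha - 1} \ln\left( {\rm Tr}(\rho^\alpha \sigma^{1-\alpha}) \right), \tag{3}$$

wherein Tr denotes the trace, for $\alpha \in [0,1) \cup (1,\infty)$. The order-1 Rényi divergence,
known also as the KL divergence and the average relative entropy, follows from the limit as
$\alpha \to 1$:

$$D(P\|Q) = \int dx\, P(x) \ln\left( P(x)/Q(x) \right) \tag{4}$$

for classical distributions, and $D(\rho\|\sigma) = {\rm Tr}(\rho[\ln(\rho) - \ln(\sigma)])$ for
quantum states. We will focus on the order-$\infty$ divergences:

$$D_\infty(P\|Q) = \ln\left( \min\{\lambda \in \mathbb{R} : P(x) \le \lambda Q(x) \;\forall x\} \right) \tag{5}$$

for classical distributions, and

$$D_\infty(\rho\|\sigma) = \ln\left( \max_{i,j}\left\{ \frac{r_i}{s_j} : \langle r_i | s_j \rangle \neq 0 \right\} \right) \tag{6}$$

for quantum states $\rho = \sum_i r_i |r_i\rangle\langle r_i|$ and
$\sigma = \sum_j s_j |s_j\rangle\langle s_j|$ [13].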
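The classical definitions in Eqs. (2), (4), and (5) can be checked numerically. The following is a minimal sketch, assuming discrete distributions (sums replace the integrals) and assuming that the support of `p` lies within the support of `q`. The function name `renyi_divergence` and the example distributions are illustrative choices, not part of the original text.

```python
import math

def renyi_divergence(p, q, alpha):
    """Order-alpha Renyi divergence D_alpha(p||q), in nats, for discrete distributions.

    Assumption of this sketch: q[x] > 0 wherever p[x] > 0.
    """
    pairs = [(pi, qi) for pi, qi in zip(p, q) if pi > 0]
    if alpha == 1:
        # Order-1 limit: the Kullback-Leibler divergence, as in Eq. (4).
        return sum(pi * math.log(pi / qi) for pi, qi in pairs)
    if math.isinf(alpha):
        # Eq. (5): the smallest lambda with p(x) <= lambda * q(x) for all x
        # equals the largest ratio p(x)/q(x).
        return math.log(max(pi / qi for pi, qi in pairs))
    # Eq. (2), with the integral replaced by a sum.
    total = sum(pi**alpha * qi**(1 - alpha) for pi, qi in pairs)
    return math.log(total) / (alpha - 1)

# Hypothetical example distributions.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
d_kl = renyi_divergence(p, q, 1)
d_2 = renyi_divergence(p, q, 2)
d_large = renyi_divergence(p, q, 200)
d_inf = renyi_divergence(p, q, math.inf)
print(d_kl, d_2, d_large, d_inf)
```

In this example the largest ratio is $p/q = 2$, so `d_inf` equals $\ln 2$, and the values grow with the order, with the order-200 value already close to the order-$\infty$ one.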

Divergences between phase-space densities. Kawai et al. consider a classical system that remains
isolated from the bath while work is performed [3]. Governed by Hamiltonian dynamics, the system
follows a deterministic trajectory through phase space. Specifying a phase-space point $(q,p)$ at
any time $t$ uniquely specifies a trajectory and a work cost $W(q,p,t)$.

An experimenter does not know which trajectory the system follows in any given forward trial,
because the experimenter ascribes to the system the initial state
$e^{-\beta H(q,p,-\tau)}/Z_{-\tau}$. The probability that the system occupies an
area-$(dq\,dp)$ region centered on $(q,p)$ at time $t$ is $\rho(q,p,t)\,dq\,dp$, wherein
$\rho(q,p,t)$ denotes the phase-space density. $\tilde\rho(q,p,t)$ denotes the phase-space density
after an amount $\tilde t = 2\tau - t$ of time has passed during the reverse protocol.

Kawai et al. proceed as follows. As the system loses no heat while work is performed, the work
required to evolve the system along some trajectory equals the difference between the final and
initial Hamiltonians:
$W(q,p,t) = H(q_\tau, p_\tau, \tau) - H(q_{-\tau}, p_{-\tau}, -\tau)$. The forward process’s
initial $\rho$ and the reverse process’s initial $\tilde\rho$ are equated with thermal states. The
Hamiltonian is assumed to have time-reversal invariance (TRI): $H(q,p,t) = H(q,-p,t)$. From TRI,
the preservation of phase-space densities by Hamiltonian dynamics, and the correspondence of
$\rho(q,p,t)$ and $\tilde\rho(q,-p,t)$ to the same Hamiltonian follows the “generalized Crooks
relation”

$$e^{\beta[W(q,p,t) - \Delta F]} = \frac{\rho(q,p,t)}{\tilde\rho(q,-p,t)}. \tag{7}$$

By taking logs, multiplying each side by $\rho(q,p,t)$, and integrating over phase space, Kawai et
al. derive

$$\langle W_{\rm diss} \rangle = \frac{1}{\beta} D(\rho(q,p,t)\|\tilde\rho(q,-p,t)). \tag{8}$$

The right-hand side (RHS) is well-defined if the support of $\rho$ lies in the support of
$\tilde\rho$: ${\rm supp}(\rho(q,p,t)) \subseteq {\rm supp}(\tilde\rho(q,-p,t))$ [13].
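The step from the generalized Crooks relation, Eq. (7), to Eq. (8) can be written out explicitly. The lines below are a sketch of that calculation, assuming only that $\rho(q,p,t)$ is normalized and that $\langle W \rangle$ denotes the average of $W(q,p,t)$ with respect to $\rho(q,p,t)$.

```latex
\begin{align}
  \beta\,[W(q,p,t) - \Delta F]
    &= \ln\frac{\rho(q,p,t)}{\tilde\rho(q,-p,t)} , \\
  \beta \int dq\,dp\; \rho(q,p,t)\,[W(q,p,t) - \Delta F]
    &= \int dq\,dp\; \rho(q,p,t)\,
       \ln\frac{\rho(q,p,t)}{\tilde\rho(q,-p,t)} , \\
  \beta\,\bigl(\langle W \rangle - \Delta F\bigr)
    &= D\bigl(\rho(q,p,t)\,\|\,\tilde\rho(q,-p,t)\bigr) .
\end{align}
```

The first line is the logarithm of Eq. (7); the second multiplies by $\rho(q,p,t)$ and integrates over phase space; the third uses the normalization of $\rho$ on the left and the definition in Eq. (4) on the right. Dividing by $\beta$ gives Eq. (8).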

The nonnegativity of $D$ implies that, on average, performing a protocol quickly costs a
nonnegative work penalty. The work penalty’s nonnegativity has been interpreted as the Second Law
of Thermodynamics [?], [13]. According to Stein’s Lemma, $D(P\|Q)$ governs how quickly the
probability that an attempt to distinguish between $P$ and $Q$ will fail decays as the number of
samples grows [?], [13]. $D(\rho(q,p,t)\|\tilde\rho(q,-p,t))$ quantifies the distinguishability of
the forward-process density from its time-reverse. $D(P\|Q)$ vanishes if and only if $P = Q$ [13].
Equation (8) shows that reversing the trajectory followed during the forward protocol yields the
trajectory followed during the reverse protocol if and only if the system dissipates no work on
average. No work is dissipated if the process proceeds quasistatically, such that the system
remains in equilibrium. Hence $D$ quantifies roughly how far from equilibrium the system evolves.

Let us turn from averages over infinitely many trials 
to single trials. 

Theorem 1. The worst-case dissipated work of the foregoing protocol is proportional to an
order-$\infty$ Rényi divergence between phase-space distributions:

$$W^{\rm worst}_{\rm diss} = \frac{1}{\beta} D_\infty(\rho(q,p,t)\|\tilde\rho(q,-p,t)), \tag{9}$$

if ${\rm supp}(\rho(q,p,t)) \subseteq {\rm supp}(\tilde\rho(q,-p,t))$.


Proof. First, we take the logarithm of each side of the generalized Crooks relation [Eq. (7)] and
divide by $\beta$:

$$W(q,p,t) - \Delta F = \frac{1}{\beta} \ln\left( \frac{\rho(q,p,t)}{\tilde\rho(q,-p,t)} \right). \tag{10}$$

We maximize each side of the equation, invoking the logarithm’s monotonicity to shift the maximum
into the argument:

$$\max_{q,p}\{ W(q,p,t) \} - \Delta F = \frac{1}{\beta} \ln\left( \max_{q,p}\left\{ \frac{\rho(q,p,t)}{\tilde\rho(q,-p,t)} \right\} \right). \tag{11}$$

Comparing the left-hand side (LHS) with the definition of $W^{\rm worst}_{\rm diss}$ and the RHS
with the definition of $D_\infty$ yields Eq. (9). □

Like Eq. (8), Theorem 1 relates dissipated work to a measure of the difference between
$\rho(q,p,t)$ and $\tilde\rho(q,-p,t)$. The more work is dissipated during the most expensive
possible trial, the less the forward-process density can resemble its time-reversed cousin. The
lesser the resemblance, the farther the system is expected to depart from equilibrium. As in
Eq. (8), the LHS of Eq. (9) is time-independent, so the RHS remains constant for all
$t \in [-\tau, \tau]$.

Equation (9) has the correct quasistatic limit: If work is invested infinitesimally slowly, the
worst amount of work that can be dissipated (the only amount that can be dissipated) vanishes:
$W_{\max} - \Delta F = \Delta F - \Delta F = 0$. Because the system remains in equilibrium,
$H(\lambda_t)$ and $\beta$ determine the state completely. The RHS of Eq. (9) becomes
$\frac{1}{\beta} D_\infty(\rho(q,p,t)\|\tilde\rho(q,-p,t)) = 0$.

Theorem 1 can aid an agent who has imperfect information about phase-space densities. Kawai et al.
recommend using Eq. (8) to predict $\langle W_{\rm diss} \rangle$ from $\rho$ and $\tilde\rho$.
Phase-space densities, they acknowledge, can be difficult to learn about. So they bound
$\langle W_{\rm diss} \rangle$ with a $D$ between coarse-grained densities. Theorem 1 offers an
alternative to coarse-graining. One can use the theorem upon learning just the maximum of
$\rho/\tilde\rho$, rather than the densities’ precise forms. Instead of bounding
$\langle W_{\rm diss} \rangle$, one can calculate a one-shot dissipated work exactly.

Interchanging the arguments of $D_\infty$ yields the worst-case forfeited work. One can extract
less work by implementing the reverse protocol at finite speed than by implementing the protocol
quasistatically, due to dissipation. The worst-case forfeited work

$$W^{\rm worst}_{\rm forfeit} := \Delta F - W_{\max} \tag{12}$$

is the most work an agent might sacrifice for time in any finite-speed reverse trial:

$$W^{\rm worst}_{\rm forfeit} = \frac{1}{\beta} D_\infty(\tilde\rho(q,-p,t)\|\rho(q,p,t)), \tag{13}$$

if ${\rm supp}(\tilde\rho(q,-p,t)) \subseteq {\rm supp}(\rho(q,p,t))$.

Divergences between quantum states. Parrondo et al. have quantized Eq. (8) [?]. They consider a
quantum system governed by a quantum Hamiltonian $H(\lambda_t)$ specified by an external parameter
$\lambda_t$. Let $\rho(t)$ denote the state occupied by the system at time $t$. In the forward
protocol, the system begins in thermal equilibrium:
$\rho(-\tau) = e^{-\beta H(\lambda_{-\tau})}/Z_{-\tau}$. During $t \in (-\tau, \tau)$, the system
is isolated from the bath, and an agent invests work to switch $\lambda_t$ from $\lambda_{-\tau}$
to $\lambda_\tau$. The state changes unitarily. During the reverse protocol, the system is prepared
in the state $\tilde\rho(\tau) = e^{-\beta H(\lambda_\tau)}/Z_\tau$; time runs from $t = \tau$ to
$t = -\tau$; and work is extracted via the time-reversed schedule $\lambda_{-t}$.

Assuming that ${\rm supp}(\rho(t)) \subseteq {\rm supp}(\tilde\rho(t))$, Parrondo et al. derive

$$\langle W_{\rm diss} \rangle = \frac{1}{\beta} D(\rho(t)\|\tilde\rho(t)). \tag{14}$$

Recycling their set-up, we will prove a proportionality between the worst-case dissipated work and
an order-$\infty$ Rényi divergence. We must define “work” explicitly. In some quantum
fluctuation-relation contexts, work is defined in terms of two energy measurements [30]: The system
begins in the thermal state $\gamma_{-\tau}$. An energy measurement at $t = -\tau$ yields some
eigenvalue $E_i$ of $H_{-\tau}$. The system is isolated from the bath, and the state evolves
unitarily. An energy measurement at $t = \tau$ yields some eigenvalue $E_j$ of $H_\tau$. As the
system exchanges no heat during the unitary evolution, the difference between the measurement
outcomes equals the work performed: $W = E_j - E_i$.

We assume that the agent does not learn the initial measurement’s outcome until the end of the
protocol. Because the state begins block-diagonal relative to the initial Hamiltonian, this
measure-and-forget operation preserves the initial state.

Theorem 2. The worst-case work dissipated during any such quantum forward trial is

$$W^{\rm worst}_{\rm diss} = \frac{1}{\beta} D_\infty(\rho(t)\|\tilde\rho(t)). \tag{15}$$

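Proof. Let $\rho(t) = \sum_i p_i |i(t)\rangle\langle i(t)|$ and
$\tilde\rho(t) = \sum_j \tilde p_j |\tilde j(t)\rangle\langle \tilde j(t)|$ denote the states’
eigenvalue decompositions. The eigenvalues, and the inner products
$\langle i(t)|\tilde j(t)\rangle$, remain constant throughout the unitary evolution.
$D_\infty(\rho(t)\|\tilde\rho(t))$ therefore remains constant. Without loss of generality, we can
evaluate the definition [Eq. (6)] at $t = \tau$:

$$D_\infty(\rho(t)\|\tilde\rho(t)) = \ln\left( \max_{i,j}\left\{ \frac{p_i}{\tilde p_j} : \langle i(\tau)|\tilde j(\tau)\rangle \neq 0 \right\} \right). \tag{16}$$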

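Let $U$ denote the unitary that evolves the initial state to the final in the forward process:
$\rho(\tau) = U\rho(-\tau)U^\dagger$. We can express the inner product as
$\langle i(-\tau)|U^\dagger|\tilde j(\tau)\rangle$. The thermal natures of $\rho(-\tau)$ and
$\tilde\rho(\tau)$ imply that $p_i = e^{-\beta E_i}/Z_{-\tau}$ and
$\tilde p_j = e^{-\beta E_j}/Z_\tau$. Since $Z_\tau/Z_{-\tau} = e^{-\beta \Delta F}$, Eq. (16) is
equivalent to

$$D_\infty(\rho(t)\|\tilde\rho(t)) = \ln\left( \max_{i,j}\left\{ e^{\beta(E_j - E_i - \Delta F)} : \langle i(-\tau)|U^\dagger|\tilde j(\tau)\rangle \neq 0 \right\} \right). \tag{17}$$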

The work dissipated in some forward trial is proportional to the exponential’s argument. The
forward protocol is unable to map $|i(-\tau)\rangle$ to $|\tilde j(\tau)\rangle$ if and only if
$\langle i(-\tau)|U^\dagger|\tilde j(\tau)\rangle = 0$, i.e., if and only if the condition in
Eq. (17) is violated. Hence the worst-case work that can be dissipated during any forward trial is
proportional to the exponential’s argument, maximized under the condition in Eq. (17). Rearranging
Eq. (17) yields Eq. (15). □
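The proof of Theorem 2 can be illustrated with a small numerical sketch. It assumes a finite-dimensional system described by the initial energies `e_init[i]`, the final energies `e_final[j]`, and a matrix `transition_prob[i][j]` of squared overlaps that says which $i \to j$ transitions the unitary allows. These names, the two-level example values, and the use of a strictly-positive test for "allowed" are assumptions of the sketch, not part of the original text.

```python
import math

def worst_case_dissipated_work(e_init, e_final, transition_prob, beta):
    """Return (W_diss_worst, D_inf) for a two-energy-measurement protocol.

    transition_prob[i][j] is assumed to hold the squared overlap between the
    evolved initial eigenstate i and the final eigenstate j.
    """
    z_init = sum(math.exp(-beta * e) for e in e_init)
    z_final = sum(math.exp(-beta * e) for e in e_final)
    # Z_tau / Z_{-tau} = exp(-beta * Delta F)
    delta_f = -math.log(z_final / z_init) / beta
    allowed = [(i, j)
               for i in range(len(e_init))
               for j in range(len(e_final))
               if transition_prob[i][j] > 0]
    # Largest value of W - Delta F = E_j - E_i - Delta F over allowed transitions.
    w_worst = max(e_final[j] - e_init[i] - delta_f for i, j in allowed)
    # Eq. (16): largest ratio of thermal populations over allowed transitions.
    d_inf = math.log(max(
        (math.exp(-beta * e_init[i]) / z_init)
        / (math.exp(-beta * e_final[j]) / z_final)
        for i, j in allowed))
    return w_worst, d_inf

beta = 1.0
e_init, e_final = [0.0, 1.0], [0.0, 2.0]
# Hypothetical sudden quench that mixes both levels equally.
w_quench, d_quench = worst_case_dissipated_work(
    e_init, e_final, [[0.5, 0.5], [0.5, 0.5]], beta)
# Hypothetical level-preserving evolution: only i -> i transitions occur.
w_slow, d_slow = worst_case_dissipated_work(
    e_init, e_final, [[1.0, 0.0], [0.0, 1.0]], beta)
print(w_quench, d_quench / beta, w_slow, d_slow / beta)
```

In both cases the divergence divided by $\beta$ matches the largest allowed value of $E_j - E_i - \Delta F$, as Eq. (15) states. Forbidding the ground-to-excited transition lowers the worst case by the energy gap that is no longer reachable.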

The discussion of irreversibility, distinguishability, $t$-dependence, the quasistatic limit, and
coarse-graining that characterizes the classical Theorem 1 characterizes also the quantum
Theorem 2. $W^{\rm worst}_{\rm diss}$ is bounded when $H_{-\tau}$ and $H_\tau$ have bounded
spectra, as in many problems in one-shot statistical mechanics (e.g., [?]).

Divergences between work distributions. We have related dissipated work to a divergence
$D_\infty$ between phase-space densities and to a $D_\infty$ between quantum states. We now relate
$W^{\rm worst}_{\rm diss}$ to a $D_\infty$ between distributions over possible values of work.

The Kullback-Leibler divergence between $P_{\rm fwd}(W)$ and $P_{\rm rev}(-W)$ is proportional to
the average dissipated work:

$$\frac{1}{\beta} D(P_{\rm fwd}(W)\|P_{\rm rev}(-W)) = \langle W \rangle_{\rm fwd} - \Delta F = \langle W_{\rm diss} \rangle. \tag{18}$$

The first equality follows from the substitution from Crooks’ Theorem [Eq. (1)] for
$P_{\rm fwd}(W)/P_{\rm rev}(-W)$ in the definition of $D(P_{\rm fwd}(W)\|P_{\rm rev}(-W))$. We
will derive a one-shot analog of Eq. (18).

Theorem 3. The worst-case work that can be dissipated in any forward trial is proportional to the
order-$\infty$ Rényi divergence between $P_{\rm fwd}(W)$ and $P_{\rm rev}(-W)$:

$$W^{\rm worst}_{\rm diss} = \frac{1}{\beta} D_\infty(P_{\rm fwd}(W)\|P_{\rm rev}(-W)), \tag{19}$$

if the set of possible work-values is bounded.

Proof. By the definition of $D_\infty$,

$$D_\infty(P_{\rm fwd}(W)\|P_{\rm rev}(-W)) = \ln\left( \min\{\lambda \in \mathbb{R} : P_{\rm fwd}(W) \le \lambda P_{\rm rev}(-W) \;\forall W\} \right). \tag{20}$$

Let us solve for the minimal $\lambda$-value $\lambda_{\min}$ that satisfies the inequality. First,
we check that we can divide the inequality by $P_{\rm rev}(-W)$. Crooks’ Theorem implies that
$P_{\rm fwd}(W) = e^{\beta(W - \Delta F)} P_{\rm rev}(-W)$. By assumption, $P_{\rm fwd}(W)$ and
$P_{\rm rev}(-W)$ are nonzero only if $W$ is finite. Also, $\Delta F$ is finite. Hence Crooks’
Theorem implies that $P_{\rm rev}(-W) = 0$ if and only if $P_{\rm fwd}(W) = 0$. In this case, the
inequality becomes $0 \le \lambda \cdot 0$, which is satisfied by any finite $\lambda$ and so does
not determine $\lambda_{\min}$. To solve for $\lambda_{\min}$, we can restrict our focus to
$P_{\rm rev}(-W) \neq 0$, then divide each side of the inequality in Eq. (20) by
$P_{\rm rev}(-W)$:

$$\lambda_{\min} \ge \frac{P_{\rm fwd}(W)}{P_{\rm rev}(-W)} \quad \forall W. \tag{21}$$

Substituting into the RHS from Crooks’ Theorem yields
$\lambda_{\min} \ge e^{\beta(W - \Delta F)}$. This bound saturates when $W$ assumes its maximal
value $W_{\max}$:
$\lambda_{\min} = e^{\beta(W_{\max} - \Delta F)} = e^{\beta W^{\rm worst}_{\rm diss}}$.
Substituting into Eq. (20) yields Eq. (19). □
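Equations (18) and (19) can be checked together on a toy work distribution. The sketch below assumes a discrete, bounded set of work values, fixes $\Delta F$ through Jarzynski’s equality so that the reverse distribution built from Crooks’ Theorem is normalized, and then compares both divergences with the corresponding dissipated works. The function name, the dictionary keys, and the example numbers are hypothetical.

```python
import math

def dissipated_work_from_distribution(work_values, p_fwd, beta):
    """Build P_rev via Crooks' Theorem and return the quantities in Eqs. (18)-(19)."""
    # Assumption: Delta F is fixed by Jarzynski's equality, <exp(-beta W)> = exp(-beta Delta F),
    # which is what makes the reverse distribution below sum to one.
    delta_f = -math.log(
        sum(p * math.exp(-beta * w) for w, p in zip(work_values, p_fwd))) / beta
    # Crooks' Theorem, Eq. (1): P_rev(-W) = P_fwd(W) * exp(-beta (W - Delta F)).
    p_rev = [p * math.exp(-beta * (w - delta_f)) for w, p in zip(work_values, p_fwd)]
    pairs = [(pf, pr) for pf, pr in zip(p_fwd, p_rev) if pf > 0]
    d_inf = math.log(max(pf / pr for pf, pr in pairs))
    d_kl = sum(pf * math.log(pf / pr) for pf, pr in pairs)
    mean_work = sum(w * p for w, p in zip(work_values, p_fwd))
    return {"delta_f": delta_f, "p_rev": p_rev, "d_inf": d_inf,
            "d_kl": d_kl, "mean_work": mean_work}

beta = 1.0
work_values = [0.5, 1.0, 2.0]   # hypothetical possible work costs
p_fwd = [0.2, 0.5, 0.3]         # hypothetical forward work distribution
res = dissipated_work_from_distribution(work_values, p_fwd, beta)
worst_dissipated = res["d_inf"] / beta     # Theorem 3: equals W_max - Delta F
average_dissipated = res["d_kl"] / beta    # Eq. (18): equals <W> - Delta F
print(worst_dissipated, max(work_values) - res["delta_f"])
print(average_dissipated, res["mean_work"] - res["delta_f"])
```

Each printed pair should agree up to floating-point error, and the worst-case value is never smaller than the average one.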

Just as $\frac{1}{\beta} D(P_{\rm fwd}(W)\|P_{\rm rev}(-W))$ equals the average, over many trials,
of dissipated work, $\frac{1}{\beta} D_\infty(P_{\rm fwd}(W)\|P_{\rm rev}(-W))$ equals the most
work that could be dissipated in any trial. An agent can calculate this dissipated work upon
inferring $P_{\rm fwd}$ and $P_{\rm rev}$ from experimental or simulation statistics.

Theorem 3 contains a Rényi divergence between work distributions, rather than a $D_\infty$ between
phase-space distributions or a $D_\infty$ between quantum states. Hence Theorem 3 governs more
protocols than Theorems 1 and 2, as it describes all protocols that obey Crooks’ Theorem: quantum
or classical, regardless of whether the system exchanges heat while work is performed.

Interchanging the divergence’s arguments yields the worst-case forfeited work [Eq. (12)]:

$$W^{\rm worst}_{\rm forfeit} = \frac{1}{\beta} D_\infty(P_{\rm rev}(-W)\|P_{\rm fwd}(W)). \tag{22}$$

Outlook and discussion. We have developed one-shot analogs of three relationships between the
average dissipated work $\langle W_{\rm diss} \rangle$ and an “average” Rényi divergence $D$. We
related the worst-case dissipated work $W^{\rm worst}_{\rm diss}$ to an order-$\infty$ Rényi
divergence $D_\infty$ between classical phase-space distributions, to a $D_\infty$ between quantum
states, and to a $D_\infty$ between work distributions. In all three cases, the proportionality
between the averages $\langle W_{\rm diss} \rangle$ and $D$ also characterizes the one-shot
quantities $W^{\rm worst}_{\rm diss}$ and $D_\infty$.

The incorporation of risk tolerance into these results merits investigation. An agent can trade
off the guarantee that each trial will accomplish its purpose with the possibility of paying less
work (or extracting more work) than by exerting caution. Risk tolerance can be quantified with a
parameter $\varepsilon \in [0,1]$. This failure probability, chosen by the agent, has been
incorporated into Rényi divergences [?] and one-shot statistical mechanics (e.g., [?], [?]). The
incorporation of $\varepsilon$ into the results above, as well as the consideration of
different-order Rényi divergences $D_\alpha \neq D_\infty$, should provide further insights into
fluctuation relations via one-shot statistical mechanics.

Note added. Lemma 3 appeared previously in an early draft of [?] but was deleted from the
manuscript. Theorems 1 and 2 have never, to our knowledge, appeared in the literature.